[KVCache] Support environment variable overrides for AttentionStore c…#7455
[KVCache] Support environment variable overrides for AttentionStore c…#7455jackyYang6 wants to merge 1 commit intoPaddlePaddle:developfrom
Conversation
|
Thanks for your contribution! |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #7455 +/- ##
==========================================
Coverage ? 73.47%
==========================================
Files ? 398
Lines ? 54979
Branches ? 8613
==========================================
Hits ? 40398
Misses ? 11873
Partials ? 2708
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
9fbff4a to
8407b8a
Compare
|
|
||
| try: | ||
| self.config.namespace = os.getenv("MODEL_ID", self.config.namespace) | ||
| self.config.pod_name = os.getenv("FD_POD_NAME", self.config.pod_name) |
There was a problem hiding this comment.
这个Pod Name 需要和上报给IC 和 IM 的pod name 对齐吗?
8407b8a to
2d24f4b
Compare
2d24f4b to
7755c6b
Compare
PaddlePaddle-bot
left a comment
There was a problem hiding this comment.
🤖 AI Code Review |
2026-04-20 18:01:43
📋 Review 摘要
PR 概述:为 AttentionStore 的 namespace、pod_name、model_version 新增环境变量覆盖能力,并根据 ENABLE_EP_DP_IN_FD 拼接 splitwise_role 和 dp_id 到 pod_name。
变更范围:cache_manager/ — AttentionStore 配置初始化
影响面 Tag:KVCache
📝 PR 规范检查
PR 描述中的代码示例与实际 diff 不一致:描述仅展示了 AS_NAMESPACE、AS_POD_NAME、AS_MODEL_VERSION 三个环境变量覆盖,但实际代码还新增了 ENABLE_EP_DP_IN_FD 控制的 pod_name 拼接逻辑以及 splitwise_role 字段,建议同步更新 PR 描述。
问题
| 级别 | 文件 | 概述 |
|---|---|---|
| 🔴 Bug | attention_store.py:70 |
int(os.getenv("ENABLE_EP_DP_IN_FD", ...)) 对非数字字符串会抛 ValueError,导致初始化失败 |
总体评价
环境变量覆盖的设计思路合理,但 ENABLE_EP_DP_IN_FD 的解析方式存在健壮性问题,需要修复。
| self.config.namespace = os.getenv("AS_NAMESPACE", self.config.namespace) | ||
| self.config.pod_name = os.getenv("AS_POD_NAME", self.config.pod_name) | ||
| if int(os.getenv("ENABLE_EP_DP_IN_FD", "1")): | ||
| self.config.pod_name = self.config.pod_name + "_" + self.config.splitwise_role + "_" + str(self.config.dp_id) |
There was a problem hiding this comment.
🔴 Bug int(os.getenv("ENABLE_EP_DP_IN_FD", "1")) 当环境变量被设置为非数字字符串(如 "true"、"false"、"enabled")时,int() 会抛出 ValueError。
由于此处位于 try ... except Exception 块内,异常会被捕获并导致整个 AttentionStoreSDK 初始化失败,且错误信息会令人困惑(不会提示是环境变量格式问题)。
建议使用更健壮的解析方式:
def _str_to_bool(val: str) -> bool:
return val.strip().lower() not in ("0", "false", "no", "off", "")
if _str_to_bool(os.getenv("ENABLE_EP_DP_IN_FD", "1")):或者至少增加异常处理:
try:
enable_ep_dp = int(os.getenv("ENABLE_EP_DP_IN_FD", "1"))
except ValueError:
enable_ep_dp = 1
logger.warning("ENABLE_EP_DP_IN_FD 环境变量值无效,使用默认值 1")
if enable_ep_dp:
Motivation
AttentionStore's
namespace,pod_name, andmodel_versionare currently only configurable via constructor arguments. In K8s deployments, these values are typically derived from Pod environment variables (e.g.,AS_NAMESPACE,AS_POD_NAME,AS_MODEL_VERSION). Without environment variable overrides, deployment configuration is inflexible.Modifications
Modified file:
fastdeploy/cache_manager/transfer_factory/mooncake_store/attention_store.pyAdded environment variable overrides for
AttentionStoreConfigfields before SDK initialization inAttentionStore.__init__:Usage or Command
Set the corresponding environment variables before launching the service:
Accuracy Tests
Not applicable — this change does not affect model outputs.
Checklist
pre-commitbefore commit.releasebranch, make sure the PR has been submitted to thedevelopbranch, then cherry-pick it to thereleasebranch with the[Cherry-Pick]PR tag.